Picking the right guitar to purchase

Adam Maghout

Methodology & Statistics @ UU University

25 Sep 2024

Data used

To measure which genres are popular to play, we will analyse a dataset concerning Rolling Stone’s 500 Greatest Albums of All Time from 2012. This list is revised every decade so a newer list is available but has not been generously turned into a manipulable dataset and published on data.world like the 2012 list.

We must first load this dataset into R, which can be executed silently. We will also load useful packages at the same time.

Exploring the data

There is a variety of subgenres present in the loaded dataset. In a first time, it might be interesting to simply explore how songs were classified. Since these were added to the dataset using a python package, this will also help identify potential misclassifications.

What about specific artists or gender?

There aren’t many repeat artists in the data, and gender is not given so not much can be done here without a larger and more detailed dataset. The code in R for obtaining counts for artists and gender in such a situation is however presented out of interest.

artist_counts <- RollingStone %>%
  count(Artist, sort = TRUE)
  
gender_counts <- RollingStone %>%
  count(Gender, sort = TRUE)